#mainTextBox Set the learning rate to 0.005 and watch the weight track oscillate around the optimal value (0.3009), with values between 0.32 and 0.27 (the rattling basin). The MSE is computed as an average over the epoch, so it does not rattle. The misadjustment is the excess of the final MSE (0.24) over the optimal MSE (0.23), divided by the optimal MSE, so it is approximately 4%. If we decrease the learning rate to 0.001, the weight oscillation is smaller (0.303-0.292) and the misadjustment drops to approximately 1%.
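As a rough check of these quantities, here is a minimal on-line LMS sketch in NumPy. The data is synthetic, so the demo's exact numbers (optimal weight 0.3009, MSEs of 0.23 and 0.24) will not be reproduced; only the trend (smaller learning rate, smaller rattling basin and smaller misadjustment) should carry over.

```python
# Minimal on-line LMS sketch for a one-weight regressor (synthetic data,
# so the demo's exact numbers will differ; only the trend carries over).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)                    # input samples
d = 0.3 * x + 0.1 * rng.standard_normal(200)   # noisy desired response

w_opt = np.dot(x, d) / np.dot(x, x)            # analytical least-squares weight
mse_opt = np.mean((d - w_opt * x) ** 2)        # optimal (minimum) MSE

def lms(eta, epochs=200):
    w = 0.0
    for _ in range(epochs):
        sq_err, w_lo, w_hi = 0.0, w, w
        for xi, di in zip(x, d):
            e = di - w * xi                    # instantaneous error
            w += eta * e * xi                  # on-line LMS update
            sq_err += e * e
            w_lo, w_hi = min(w_lo, w), max(w_hi, w)
        mse = sq_err / len(x)                  # MSE averaged over the epoch
    return w_lo, w_hi, mse                     # rattling basin of the last epoch

for eta in (0.005, 0.001):
    w_lo, w_hi, mse = lms(eta)
    m = (mse - mse_opt) / mse_opt              # misadjustment: excess over optimal MSE
    print(f"eta={eta}: w in [{w_lo:.3f}, {w_hi:.3f}], misadjustment={m:.1%}")
```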
#titleTextBox Rattling
#subtitleTextBox Is the Regression Line a Line?
#mainTextBox Now change the learning rate to 0.018 (critically damped, i.e. fastest convergence). Notice that the error is lower than the analytical minimum (0.23). How can this be? The regression line shows the reason. As we mentioned before, the regression line is adjusted between each sample, but the error is computed (and plotted) on a point-by-point basis. Thus the mean square error is not computed with a single set of weights (a single regression line). In on-line learning mode it is more difficult to compute the final MSE, since the weights may change significantly between samples.
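The following sketch illustrates this point: a line that moves between samples is not a single fixed line, so its point-by-point MSE can fall below the MSE of the best fixed line. The data here is synthetic with a slowly drifting slope, chosen only to make the effect easy to see; the demo's own data set and numbers are different.

```python
# A moving regression line vs. the best single fixed line. Because the
# weight adapts between samples, the point-by-point error can be lower
# than the fixed-line minimum (synthetic drifting-slope data, for clarity).
import numpy as np

rng = np.random.default_rng(1)
n = 400
x = rng.uniform(-1, 1, n)
slope = 0.3 + 0.1 * np.sin(np.linspace(0, 4 * np.pi, n))  # slowly drifting slope
d = slope * x + 0.02 * rng.standard_normal(n)

w_opt = np.dot(x, d) / np.dot(x, x)            # best single fixed line
mse_fixed = np.mean((d - w_opt * x) ** 2)      # its (analytical minimum) MSE

eta, w, sq_err = 0.5, 0.0, 0.0
for xi, di in zip(x, d):
    e = di - w * xi        # error measured with the current (moving) weight
    sq_err += e * e
    w += eta * e * xi      # the line has already moved before the next point
mse_online = sq_err / n

print(f"best fixed line MSE : {mse_fixed:.5f}")
print(f"moving line MSE     : {mse_online:.5f}")   # typically lower here
```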
#titleTextBox Rattling
#subtitleTextBox Summary
#mainTextBox In this demonstration we have illustrated the relationship between rattling and speed of adaptation for the LMS. Large learning rates give quick convergence but large rattling, which may leave large errors when we stop the training. Small learning rates give slow convergence but a good final value for the weights. A useful rule of thumb is to use a learning rate of 1/10 of the largest possible one. When this precaution is not taken, the weights change so much that the estimated system parameters are effectively unusable. The batch mode does not have these problems.
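To make the rule of thumb concrete, here is one common estimate (an assumption in this sketch, not a formula stated by the demo): the largest stable LMS step size is bounded by roughly 2 divided by the input power, and the rule of thumb then takes one tenth of that bound.

```python
# Rule-of-thumb learning rate, assuming the common LMS stability estimate
# eta_max ~ 2 / tr(R), where tr(R) is the input power (the trace of the
# input autocorrelation matrix; for a single weight, just the mean of x^2).
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 200)            # example input data

input_power = np.mean(x ** 2)          # tr(R) for a single-weight filter
eta_max = 2.0 / input_power            # rough upper bound for stability
eta = 0.1 * eta_max                    # rule of thumb: 1/10 of the largest

print(f"estimated eta_max = {eta_max:.3f}, suggested eta = {eta:.3f}")
```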